Evolutionary rule-based systems for imbalanced data sets
Abstract
This paper investigates the capabilities of evolutionary online rule-based systems, also called Learning Classifier Systems (LCSs), for extracting knowledge from imbalanced data. While some learners may suffer from class imbalances and from instances sparsely distributed across the feature space, we show that LCSs are flexible methods that can be adapted to detect such cases and find suitable models. Results on artificial datasets specifically designed to test LCSs on imbalanced data show that they are able to extract knowledge from highly imbalanced datasets. When faced with real-world problems, LCSs prove to be among the most robust methods when compared with instance-based learners, decision trees and support vector machines. Moreover, all the learners benefit from resampling techniques. Although no single resampling technique performs best on all datasets and for all learners, those based on oversampling seem to perform better on average. The paper adapts and analyses LCSs for challenging imbalanced datasets and lays the groundwork for further study of which combination of resampling technique and learner is best suited to a specific kind of problem.
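To make the notion of oversampling concrete, the following is a minimal sketch of plain random oversampling, in which minority-class instances are duplicated until every class is as large as the largest one. The abstract does not name the specific resampling methods that were compared, so the function `random_oversample`, its parameters and the toy data are illustrative assumptions rather than the paper's procedure.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Hypothetical helper: duplicate randomly chosen instances of each
    smaller class until all classes match the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # All original instances that carry this class label.
        pool = [x for x, lab in zip(X, y) if lab == label]
        # Append random duplicates until the class reaches the target size.
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): four majority-class instances and one minority instance.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_res, y_res = random_oversample(X, y)
print(Counter(y_res))  # both classes now have four instances
```

Resampling of this kind would normally be applied to the training partition only, so that the test data keep their original class distribution.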
Related resources
On Mining Fuzzy Classification Rules for Imbalanced Data
The fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, so that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
A New Genetic Programming-Based Method for Weighting Fuzzy Rules in Imbalanced Classification
In classification problems, we often encounter datasets in which the classes hold different percentages of the patterns (i.e. classes with a high pattern percentage and classes with a low pattern percentage). These problems are called “classification problems with imbalanced data-sets”. Fuzzy rule-based classification systems are the most popular fuzzy modeling systems used in pattern classification problems. Rule weights...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Improving the Performance of Fuzzy Rule Based Classification Systems for Highly Imbalanced Data-Sets Using an Evolutionary Adaptive Inference System
In this contribution, we study the influence of an Evolutionary Adaptive Inference System with parametric conjunction operators for Fuzzy Rule Based Classification Systems. Specifically, we work in the context of highly imbalanced data-sets, which is a common scenario in real applications, since the number of examples that represents one of the classes of the data-set (usually the concept of in...
Journal title:
- Soft Comput.
Volume: 13
Publication date: 2009